Design and analysis of static memory management policies for CC-NUMA multiprocessors
Abstract
In this paper, we characterize the performance of three existing memory management techniques: the buddy, round-robin, and first-touch policies. With these existing schemes, we find several cases where requests from different processors arrive at the same memory simultaneously. To alleviate this problem, we present two improved memory management policies, called skew-mapping and prime-mapping. By exploiting the properties of skewing and of prime numbers, the improved designs considerably improve application performance on cache-coherent non-uniform memory access (CC-NUMA) multiprocessors. We also re-evaluate the performance of a multistage interconnection network under both the existing and the improved policies. Our results show how the performance benefits of the different techniques depend on the sharing patterns of applications. Applications with a low degree of sharing benefit from the data locality provided by first-touch. However, several applications with significant degrees of sharing, as well as those with single-processor initialization routines, benefit greatly from the intelligent distribution of data provided by the skew-mapping and prime-mapping schemes. The new schemes improve stall time by as much as 35%. © 2002 Elsevier Science B.V. All rights reserved.
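The abstract names the placement policies but does not define them, so the following is only a rough sketch of how such page-to-node mappings might look. The skew and prime formulas are assumptions borrowed from classic interleaved-memory schemes, not the paper's definitions, and the function names, the node count, and the prime 11 are all hypothetical.

```python
# Illustrative sketch only: the skew and prime formulas below are
# textbook-style assumptions, not the definitions used in the paper.

def round_robin(page: int, nodes: int) -> int:
    """Place consecutive pages on consecutive nodes."""
    return page % nodes

def skew_mapping(page: int, nodes: int) -> int:
    """Assumed skewing: shift each successive row of `nodes` pages by one node."""
    return (page + page // nodes) % nodes

def prime_mapping(page: int, nodes: int, prime: int) -> int:
    """Assumed prime mapping: reduce modulo a prime, then fold onto the nodes."""
    return (page % prime) % nodes

def first_touch(page: int, toucher: int, owner: dict) -> int:
    """Place a page on the node of the first processor that touches it."""
    return owner.setdefault(page, toucher)

def nodes_used(policy, pages) -> int:
    """Count how many distinct nodes a policy spreads the given pages over."""
    return len({policy(p) for p in pages})

NODES = 8    # hypothetical machine size
PRIME = 11   # hypothetical prime for the prime-mapping sketch
strided = range(0, NODES * NODES, NODES)  # pages 0, 8, 16, ..., 56

print("round-robin:", nodes_used(lambda p: round_robin(p, NODES), strided))
print("skew       :", nodes_used(lambda p: skew_mapping(p, NODES), strided))
print("prime      :", nodes_used(lambda p: prime_mapping(p, NODES, PRIME), strided))

# Single-processor initialization under first-touch: processor 0 touches
# every page first, so every page ends up on node 0.
owner = {}
for page in range(16):
    first_touch(page, 0, owner)
print("first-touch nodes after init by processor 0:", set(owner.values()))
```

Under these assumptions, a page stride equal to the node count sends every page to a single node with round-robin, while the skewed mapping spreads the same pages over all eight nodes. The first-touch loop shows why a single-processor initialization routine can concentrate all data on one node.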
Related resources
Design and Evaluation of a Switch Cache Architecture for CC-NUMA Multiprocessors
Cache coherent nonuniform memory access (CC-NUMA) multiprocessors provide a scalable design for shared memory. But, they continue to suffer from large remote memory access latencies due to comparatively slow memory technology and large data transfer latencies in the interconnection network. In this paper, we propose a novel hardware caching technique, called switch cache, to improve the remote...
Design and Evaluation of a Switch
Cache coherent non-uniform memory access (CC-NUMA) multiprocessors provide a scalable design for shared memory but they continue to suffer from large remote memory access latencies due to comparatively slow memory technology and data transfer latencies in the interconnection network. In this paper, we propose a novel hardware caching technique, called switch cache, to improve the remote memory...
Performance Evaluation of Memory Allocation Schemes on CC-NUMA Multiprocessors
Cache Coherent Non-Uniform Memory Access (CC-NUMA) architectures have received strong interest from both academia and industry. This paper studies the performance impact of design choices at different levels of address and memory mapping on CC-NUMA architectures. Through execution-driven simulations of five numerical programs, we find close interactions between data allocation, global address t...
Distributed Array Data Management on NUMA Multiprocessors
Management of program data to reduce false sharing and improve locality is critical for scaling performance on NUMA multiprocessors. We use HPF-like directives to partition and place arrays in data-parallel applications on Hector, a shared-memory NUMA multiprocessor. We present experimental results that demonstrate the magnitude of the performance improvement attainable when our proposed array ...
Design alternatives for shared memory multiprocessors
In this paper, we consider the design alternatives available for building the next generation DSM machine (e.g., the choice of memory architecture, network technology, and amount and location of per-node remote data cache). To investigate this design space, we have simulated five applications on a wide variety of possible DSM architectures that employ significantly different caching techniques....
Journal title:
- Journal of Systems Architecture
Volume 48
Publication date: 2002